Query Scrambling for Bursty Data Arrival
Abstract
Distributed databases operating over wide-area networks, such as the Internet, must deal with the unpredictable nature of the performance of communication. The response times of accessing remote sources may vary widely due to network congestion, link failure, and other problems. In this paper we examine a new class of methods, called query scrambling, for dealing with unpredictable response times. Query scrambling dynamically modifies query execution plans on-the-fly in reaction to unexpected delays in data access. We explore various choices in the implementation of these methods and examine, through a detailed simulation, the effects of these choices. Our experimental environment considers pipelined and non-pipelined join processing in a client with multiple remote data sources, and it focuses on bursty arrivals of data. We identify and study a number of the basic trade-offs that arise when designing scrambling policies for the bursty environment. Our performance results show that query scrambling is effective in hiding the impact of delays on query response time for a number of different delay scenarios.
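The core idea of reacting to a stalled source by doing other useful work can be illustrated with a toy simulation. The sketch below is not one of the scrambling policies evaluated in the paper. It assumes a client that processes one tuple per tick, a static plan that reads sources in a fixed order, and a hypothetical `timeout` parameter after which the client may serve any other source that already has data. The function name `run_plan` and the arrival-list input format are likewise assumptions made for illustration.

```python
from collections import deque


def run_plan(arrivals, plan, timeout=None):
    """Simulate a client that processes at most one tuple per tick.

    arrivals: dict mapping a source name to the ticks at which its tuples arrive
              (hypothetical input format for this sketch).
    plan:     source names in the order the static plan reads them.
    timeout:  None means never deviate from the plan; otherwise the number of
              consecutive stalled ticks after which other sources may be served.
    Returns the tick at which the last tuple has been processed.
    """
    queues = {s: deque(sorted(arrivals[s])) for s in plan}
    tick, stalled = 0, 0
    while any(queues.values()):
        # The plan prefers the first source that still has tuples outstanding.
        preferred = next(s for s in plan if queues[s])
        if queues[preferred][0] <= tick:
            queues[preferred].popleft()   # data is available: follow the plan
            stalled = 0
        else:
            stalled += 1
            if timeout is not None and stalled >= timeout:
                # Delay detected: serve another source that already has data.
                for s in plan:
                    if s != preferred and queues[s] and queues[s][0] <= tick:
                        queues[s].popleft()
                        break
        tick += 1
    return tick


# Source A delivers two tuples, pauses, then delivers two more (a burst gap).
# Source B has all four of its tuples available from the start.
arrivals = {"A": [0, 1, 10, 11], "B": [0, 0, 0, 0]}
plan = ["A", "B"]

print(run_plan(arrivals, plan))             # static plan: 16
print(run_plan(arrivals, plan, timeout=2))  # with rescheduling: 12
```

In this example the static plan idles while A is silent and only then turns to B, whereas the rescheduling variant uses the gap to drain B. The choice of `timeout` hints at one trade-off in a bursty setting: a small value reacts quickly but may abandon the plan over a short pause, while a large value wastes idle time before reacting.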
Similar resources
Dynamic Query Operator Scheduling for Wide-area Remote Access
Distributed databases operating over wide-area networks, such as the Internet, must deal with the unpredictable nature of the performance of communication. The response times of accessing remote sources can vary widely due to network congestion, link failure, and other problems. In such an unpredictable environment, the traditional iterator-based query execution model performs poorly. We have de...
XJoin: Getting Fast Answers From Slow and Bursty Networks
The combination of increasingly ubiquitous Internet connectivity and advances in heterogeneous and semi-structured databases has the potential to enable database-style querying over data from sources distributed around the world. Traditional query processing techniques, however, fail to deliver acceptable performance in such a scenario for two main reasons. First, they optimize for delivery of the e...
Query Scrambling in Distributed Multidatabase Systems
This work addresses the problem of efficient query processing in multidatabase systems distributed over wide-area networks. The solution unifies the query scrambling and reduction approaches to dynamic optimization of query processing plans at the data integration stage. The paper presents a new data integration algorithm based on query scrambling and the extended reduction technique. The algori...
XJoin: A Reactively-Scheduled Pipelined Join Operator
Wide-area distribution raises significant performance problems for traditional query processing techniques as data access becomes less predictable due to link congestion, load imbalances, and temporary outages. Pipelined query execution is a promising approach to coping with unpredictability in such environments as it allows scheduling to adjust to the arrival properties of the data. We have de...
A Framework For Supporting Load Shedding in Data Stream Management Systems
The arrival rate of tuples in a data stream can be unpredictable and bursty. Many stream-based applications have Quality of Service (QoS) requirements that need to be satisfied by the underlying stream processing system. In order to avoid violating predefined QoS requirements during temporary overload periods, a load shedding strategy is necessary and critical for a data stream management syste...